Using OLAP and Data Mining for Content Planning in Natural Language Generation

Authors

  • Eloi L. Favero
  • Jacques Robin
Abstract

We present a new approach to content determination and content organization in the context of natural language generation for quantitative database summaries. Three key properties make our work innovative and interesting: (1) we developed a new text planning approach that deals with the content organization of a data set (for example, a Data Mining discovery) into a summary report; (2) the approach is domain independent; (3) it covers a significant class of database summary applications.

1 Research Context: Executive Summary Generation

In this paper, we present a new approach for content determination and organization in quantitative data summarization. This approach has been developed for HYSSOP (HYpertext Summary System of On-line analytical Processing), which generates hypertext reports for OLAP summaries and Data Mining (DM) discoveries. HYSSOP is itself part of the Intelligent Decision-Support System called MATRIKS (Multidimensional Analysis and Textual Reporting for Insight Knowledge Search). MATRIKS aims to provide a comprehensive knowledge discovery environment through seamless integration of data warehousing, OLAP, DM, expert system and natural language generation technologies. The architecture of MATRIKS is given in Fig. 1. It extends previous cutting-edge environments for Knowledge Discovery in Databases (KDD) such as DBMiner [4] by the integration of:

  • a data warehouse hypercube exploration expert system allowing automation and expertise legacy of dimensional data warehouse exploration strategies developed by human data analysts using OLAP queries and data mining tools;

Similar resources

Content aggregation in natural language hypertext summarization of OLAP and Data Mining Discoveries

We present a new approach to paratactic content aggregation in the context of generating hypertext summaries of OLAP and data mining discoveries. Two key properties make this approach innovative and interesting: (1) it encapsulates aggregation inside the sentence planning component, and (2) it relies on a domain independent algorithm working on a data structure that abstracts from lexical and s...

HYSSOP: Natural Language Generation Meets Knowledge Discovery in Databases

In this paper, we present HYSSOP, a system that generates natural language hypertext summaries of insights resulting from a knowledge discovery process. We discuss the synergy between the two technologies underlying HYSSOP: Natural Language Generation (NLG) and Knowledge Discovery in Databases (KDD). We first highlight the advantages of natural language hypertext as a summarization medium for K...

Order-Planning Neural Text Generation From Structured Data

Generating texts from structured data (e.g., a table) is important for various natural language processing tasks such as question answering and dialog systems. In recent studies, researchers use neural language models and encoder-decoder frameworks for table-to-text generation. However, these neural network-based approaches do not model the order of contents during text generation. When a human...

Corpus Generation and Analysis: Incorporating Audio Data Towards Curbing Missing Information

As video data becomes widely available, it is crucial that these videos are properly annotated for effective search, mining and retrieval purposes. Significant work has been done to explore natural language description as it can provide better understanding of the video content. Ideally, a summary should be informative and accurate in order for the users to have good understanding of the video ...

Combining Hierarchical Reinforcement Learning and Bayesian Networks for Natural Language Generation in Situated Dialogue

Language generators in situated domains face a number of content selection, utterance planning and surface realisation decisions, which can be strictly interdependent. We therefore propose to optimise these processes in a joint fashion using Hierarchical Reinforcement Learning. To this end, we induce a reward function for content selection and utterance planning from data using the PARADISE fra...

Publication date: 2000